Abstract

Goal, Scope and Background. The weighting of environmental
impacts and damages on the safeguard subjects Human Health,
Ecosystems, and Resources is a significant step of fully aggregated
LCIA. Panel surveys have become a common approach in
LCIA research to investigate the preferences of stakeholders on
environmental impacts and damages. Despite the numerous studies,
the knowledge on how to elicit reliable weights is still poor
and inconsistent. We present a questionnaire study with 58
environmental science students to investigate so-called framing
effects in panel surveys.

Main Features. The study investigates the significance of different
framings, which were provided by three references. In addition,
the significance of quantitative information provided in
the questionnaire is tested. The references are (a1) safeguard
subjects without specified additional information, (a2) damages
in Europe as they are perceived by the panelist, and (a3) quantified
scenarios derived from Eco-indicator99. All participants
ranked and rated the importance of the safeguard subjects three
times, once within each reference system. Following a test-of-scope
design, the quantitative information given to the panelists
was varied. One group (b1) received data from the Eco-indicator99
methodology, whereas the other group (b2) received data with
significantly higher Human Health damages and lower Ecosystem
damages, ceteris paribus. This design allows testing the
influence of quantitative data on the rating.

Results. The weighting of the safeguard subjects (a1) reveals
that Human Health is considered a slightly more important safeguard
subject than Ecosystems. However, both are judged to be
significantly more important than Resources. This picture
changes for the references (a2) and (a3) where damages were
weighted. For both references, the respondents rated damages
to Ecosystems as most important followed by Resources and
Human Health, showing by far the lowest weights. Therefore,
the framing of the reference that was weighted played a significant
role. The ratings of the subgroups (b1) and (b2) did not
differ with respect to the importance of damages, though substantially
different quantitative information was given.


Conclusions and Recommendations. The participants of the
study were evidently insensitive to the quantitative information
provided. This raises three questions, which are discussed.
What is the mental model upon which respondents base
their beliefs and values? Can we expect that 'more sophisticated'
subjects would respond differently? Which prerequisites should
an empirical weighting procedure fulfill in order to incorporate
numerical data? We propose different approaches for future
procedures in order to accurately analyze these questions.

Keywords: Environmental damages; framing; life cycle impact 
assessment (LCIA); panel surveys; test-of-scope; valuation tasks; 
weighting 


Introduction 

Life Cycle Impact Assessment (LCIA) - a specific method of
goal-oriented, functionalistic evaluation - follows a three-step
protocol of analytical decomposition. As a first step,
the functional unit under investigation (i.e. the products or
technological processes) is decomposed into several aspects,
perspectives, or criteria relevant for evaluation. Then, each
of these aspects, perspectives, or criteria is assessed separately
in terms of every option or action alternative. Finally,
a concluding synthesis process integrates the decomposed
scores of the criteria for each option. According to the ISO
14042 standard (ISO 2000), the decomposition step within
LCIA is described as the 'selection of impact categories, category
indicators and characterization models'. Separate assessment
of each criterion is conducted through the two
mandatory steps 'assignment of LCI results to the impact
categories (classification)' and 'calculation of category indicator
results (characterization)'. These calculations yield a
collection of indicator scores (LCIA profile). For the composition
step, ISO 14042 suggests two ways of integrating
the indicator scores. One way is a direct, intuitive, holistic
interpretation based on the LCIA profiles. The second way
consists of further aggregation by calculating one or a small
set of composite scores for each option by normalizing ('calculating
the magnitude of category indicator results relative
to reference information'), grouping, and weighting the indicator
scores, for example by using a linear model. In the
latter method, the interpretation includes the comparison of
the composite scores attained for the different options.
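
The linear aggregation mentioned above can be illustrated with a minimal sketch. It assumes a plain weighted sum of normalized indicator results; the function name `composite_score`, the category keys, and all numbers are hypothetical and are not taken from ISO 14042 or from Eco-indicator 99.

```python
def composite_score(indicators, normalization, weights):
    """Weighted sum of normalized category indicator results (linear model)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Normalize each indicator by its reference value, then weight and add up.
    return sum(
        weights[category] * indicators[category] / normalization[category]
        for category in indicators
    )


# Hypothetical reference values, weights, and indicator results for two options.
normalization = {"human_health": 2.0, "ecosystems": 4.0, "resources": 10.0}
weights = {"human_health": 0.4, "ecosystems": 0.4, "resources": 0.2}
option_a = {"human_health": 1.0, "ecosystems": 1.0, "resources": 5.0}
option_b = {"human_health": 0.5, "ecosystems": 3.0, "resources": 5.0}

score_a = composite_score(option_a, normalization, weights)
score_b = composite_score(option_b, normalization, weights)
print(score_a, score_b)
```

In this made-up case, option B receives the higher composite score because its larger Ecosystems result outweighs its lower Human Health result under the chosen weights; a different weighting set could reverse the order, which is why the choice of weights matters.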

Thus, LCIA thinking relies strongly on the decision theoretic
framework. From a decision theoretic perspective, the
LCIA procedure is a special case of so-called bootstrapping
(Dawes 1971, Elstein et al. 1978). In this procedure, different
experts work on different components of this analytic
procedure at different stages in the process. LCIA provides
the framework for this and serves as a knowledge integration
tool (Scholz et al. 2002), and aids in decision making
(Werner et al. 2002) by organizing decomposition, assessment
and composition steps. There is still a dispute over
which of the two above described methods to use with the
composition step. Aggregation is criticized by some for introducing
societal values that are neither scientific nor internationally
harmonized (Owens 1999, Schmidt et al. 2002).
We agree with the argument that values are ubiquitous at all
stages of LCIA (Hofstetter 1998, Hertwich et al. 2000,
Werner et al. 2002) and that values are especially at play in
the two ways of concluding evaluations mentioned above.
From the perspective of psychological decision research, it
is recommended that values are systematically taken into
account in the composition step (somewhat contradicting
the ISO recommendations for comparative assertions disclosed
to the public, ISO 2000). This is because - as we
know from decision theory - judges tend to intuitively attribute
equal importance to every category unless the composition
procedure is explicit, conscious, and controlled
(Kleindorfer et al. 1993). The process of intuitively attributing
equal weights to every category is even encouraged by
the ISO standard (for comparative assertions disclosed to
the public); it suggests that "the comparison shall be conducted
category indicator by category indicator". In addition,
graphical representations of LCIA profiles, where bars
for different normalized category indicators are on the same
plot, intuitively encourage the reader to add up the bars for
an overall interpretation. Since there is no scientific rationale
for such equality, we favor systematically evaluating the
relative importance of the selected categories.

The composition rule for an LCIA profile can follow approaches
other than the weighted sum approach, such as
the verbal argumentative algorithm developed for the German
Umweltbundesamt method (i.e. the German Environmental
Protection Agency (UBA 1995)), the mixing triangle
method (Hofstetter et al. 1999), or any other of the
various procedures derived from the framework of decision
analysis (Seppala et al. 2002). All of these approaches allow
values to be systematically included.

Within LCIA, panel methods have already become a common
and seemingly accepted approach to investigating
stakeholders' preferences and the weights they assign to environmental
categories¹. Several surveys and panel workshops
have been conducted in order to get information about
reasonable weights for environmental category indicators
(Nagata et al. 1996, Puolamaa et al. 1996, Huppes et al.
1997, Lindeijer 1997, Sangle et al. 1999, Seppala 1999,
Virtanen et al. 1999, Harada et al. 2000, Mettier et al. 2000,
Itsubo et al. 2003). We make a distinction between midpoint
weighting and endpoint weighting. Most of these studies
have been conducted using midpoint indicators; in these
approaches, the impact categories are subject to weighting.
The studies of Mettier et al. (2000), Harada et al. (2000) and
Itsubo et al. (2003) apply endpoint weighting, as the category
indicators presented are expressed as damages to the
category endpoints.

1 For a more detailed discussion of weighting approaches and their advantages
and disadvantages, see (Hofstetter 1998, Finnveden 1999,
Finnveden et al. 2002).

Despite these numerous studies, knowledge is still limited
on how to elicit weights reliably and about which biases,
disturbing factors, and cognitive limits to take into account
when measuring weights. The lack of knowledge is still particularly
strong in terms of how best to represent those impacts
or damages being evaluated and how the information
provided to panelists, both quantitative and qualitative, affects
the weighting.

The main goal of the survey presented is to investigate salient
methodological problems associated with panel studies
in LCA and to invoke discussion about them. This can improve
the weighting techniques and the interpretation of
LCIA data, especially for damage-oriented methodologies,
such as Eco-indicator 99 (Goedkoop et al. 1999). The Life
Cycle Impact Assessment Programme of the UNEP/SETAC
Life Cycle Initiative aims at providing guidance to users on
how to derive consistent weighting procedures and sets of
weighting factors for LCIA results (Jolliet et al. 2003). The
methodological questions focused on in this survey may
contribute to that aim.

1 The Methodological Problems Focused on in the Survey 

One of our concerns is that many respondents participating
in weighting panels do not assign appropriately quantified
weights to the environmental categories to be judged. We
suspect that their answers reflect instead general belief
strengths or values associated with these categories. We
suppose that panelists ignore the variation or updating of the
quantitative information included in the weighting task. We refer
to the principles of bounded rationality (Gigerenzer et al.
1996, Todd et al. 2003), the arguments of the psychometric
paradigm (Kahneman et al. 1982), and the constructive perspective
on the elicitation of monetary values (Gregory et
al. 1993) to support our hypothesis that respondents are
directed by qualitative issues they perceive and extract from
the 'task-story' (which they then link to their personal experience
and worldview), rather than by the abstracted, numerical
information provided. Therefore, when valuing environmental
damage categories in a survey, the framing of
the valuation task is a major issue. There are three aspects
to the framing of valuation tasks (Kahneman et al. 1981,
Scholz 1987, Payne et al. 1992). Respondents can be influenced
by the context in which a task is presented (e.g. within
a political agency or for a scientific study), the emotional
and cognitive associations elicited by the content, or the reference
point chosen (e.g. a relative improvement to the current
situation will be valued very differently from a downturn
in the situation). The main focus of the survey was to
investigate what effect the information presented, particularly
the qualitative and quantitative information, had on the preferences
and weighting judgments of the participants. To investigate
the qualitative aspects of the information presented,
we tested for differences when different cognitive references
are used (see Section 1.1). To study the quantitative aspect
of presentation of the information, we conducted similar
tests for different data models (see Section 1.2).

1.1 Valuation of different cognitive references: Safeguard subjects, perceived damages, and data models

When constructing a weighting survey, one of the issues critical
for framing the valuation task is the method of presenting
the environmental categories to the panelists. For the EI'99
project, several possible framings or reference systems
were discussed. We compare the weighting of three different
reference systems in this survey. Logically, the simplest
reference system for eliciting weights is based only on
the definitions of the safeguard subjects. This is the first
reference system, referred to as 'safeguard subjects'. With the
aid of this reference, we intend to elicit the intrinsic values
that the respondents attribute to the concepts of the safeguard
subjects.

From a methodological point of view, weights are used to
aggregate the damage (or impact) category results for a product
system. Besides the intrinsic value attributed to the concept,
the weighting factors should also account for the scarcity
of a safeguard subject, i.e. the effective damage situation.
Therefore, a reference system representing damages (or impacts)
would better match the models used in LCIA than
one that only defines the safeguard subjects. In general, reference
systems consisting of damages can be introduced in
two ways. One is to not introduce any data in the survey,
but rather rely solely on the panelist's perception of and previous
knowledge about the damage categories. This second
reference is referred to as 'perceived damages'. Another way
is to introduce additional information and data about the
environmental categories. In our study, we use damage information
derived from the characterization models of EI99.
We refer to this third reference as 'data models'².

In the questionnaire used for the EI'99 survey, data on damages
in Europe were provided for Human Health, Ecosystems,
and Resources (Mettier et al. 2004); that is, a data
model framing was applied. For the present survey, therefore,
it is of special interest to investigate how panelists cope
with other reference systems and what bias these other reference
systems may introduce. In a later section, we compare
the influence that the three reference systems - safeguard
subjects, perceived damages, and data models - have
on the individuals' judgments.


2 This reference could also be labeled 'expected' or 'predicted damages' 
(Hofstetter 1999). In order to avoid a confusion between perceived and 
expected damages we chose the term 'data model'. 


1.2 Valuation of different data models: A test of scope 

The question of how to specify a reference and present it to the
panelists is also discussed in applications of Multi-Attribute
Utility Theory (MAUT) (Humphreys et al. 1989). In MAUT,
the tradeoffs between the various criteria may not be based
on comparison of the criteria themselves, but rather
on reasonable changes in the values of the criteria (von
Winterfeldt et al. 1986). The question to answer is not how
important one criterion is relative to another, but how much
change in one criterion a respondent is willing to trade off
for how much change in another criterion at a certain point.
This fact has been neglected in panel studies thus
far, as the elicitation of weights has generally been based on
the total load of one impact or damage category, rather than
tradeoff rates which rely on criteria value changes. The valuation
of the data models was therefore framed in a change-oriented
way; the values reflect changes in the three damage categories
rather than the total damages. As the characterization
models of EI99 are based on marginal modeling, one would
expect marginal changes (e.g. the normalization values of
EI99) as a reference for the weighting task. But respondents
tend to have difficulty weighting small changes (Mettier &
Hofstetter 2004). Therefore, we decided to use data on a
larger scale (see Section 2.1). In contingent valuation (CV),
there has been a heated debate over whether or not respondents
are aware of the magnitude or scale of a weighted good.
Therefore, so-called scope insensitivity is often discussed as
a major source of bias in CV surveys, especially when the
good is complex and tasks are unfamiliar (Fischhoff et al.
1993). Several studies have revealed that the respondents
were not aware of the amount of the valued good (Kahneman
et al. 1992, Hanemann 1996, Frederick et al. 1998). One way
to check whether respondents understand the amount of the good
that is valued is a test of scope. In a test of scope, the sample
is split and a different version of a damage (or good) is
presented to each half of the sample. To express this in terms
of experimental social sciences, we investigate the effect of
manipulating the basic information as an independent variable
on the individuals' weighting as a dependent variable.

The damages presented differ only in the amount of the good 
that is damaged; the rest of the information remains the same. 
According to utility theory, a greater amount of damage 
should also be given a higher weight. Therefore, the results 
of the two versions of the survey can be compared and it 
can be tested whether higher amounts of damages result in 
higher weights. In CV, such tests of scope are taken as proofs 
of validity (Arrow et al. 1993). 
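
The split-sample comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the two samples are made-up weights (in %), the helper name `welch_t` is hypothetical, and Welch's unequal-variance form of the t statistic is our choice for the sketch, since the text does not say which t-test variant is meant.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """t statistic for the difference in means (Welch, unequal variances)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_b) - mean(sample_a)) / standard_error


# Hypothetical weights (in %) for one damage category in the two versions.
version_a = [30, 25, 35, 30, 20, 40]
version_b = [30, 35, 25, 30, 40, 20]

t_stat = welch_t(version_a, version_b)
print(t_stat)  # a value near 0 indicates no detectable effect of the data
```

A t statistic close to zero means the group means are indistinguishable, i.e. the test of scope would not show sensitivity to the quantities presented. Obtaining a p-value additionally requires the t distribution, for which a statistics library routine such as `scipy.stats.ttest_ind` could be used.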

We chose to conduct such a test of scope for this study, so
we made two versions of the questionnaire. In one of the
versions, the damages reflect the European damages of the
EI'99 report (Goedkoop et al. 1999), which are used as normalization
values. In the other version, the damage to human
health is higher (five times as high) and the damage to
the ecosystem is lower (half as high). Consequently, we expected
that the two versions of the questionnaire would yield
different weighting factors. If this were to turn out to be
true, we could assume that respondents are able to comprehend
the magnitude of the figures we provide.

2 Survey Design

The sample consisted of 58 undergraduate students of Environmental
Sciences at the ETH-Zurich. Although not experts,
the students in this sample had had some experience
with environmental valuations, as all had participated in
exercises on weighting attributes for a multi-criteria analysis
prior to this survey. The questionnaire was distributed
during a course in Environmental System Analysis that included
an introduction to LCA. The students filled out the
questionnaire at home. Participation was voluntary, took
about 90 minutes, and was not required to pass the course.
Two versions of the questionnaire were distributed. These
versions only differed in the figures presented to the participants
in the data section (see below), which were varied to
allow for a test of scope analysis.

The questionnaire was structured into the following five sections: 

• Introduction and Personal Data: The survey was introduced,
and questions pertaining to participants' age, gender,
and knowledge of LCA were posed.

• Definition and rating of the safeguard subjects: The concept
of safeguard subjects in LCA was introduced and
definitions of the safeguard subjects Human Health, Ecosystems,
and Resources were given. After that, the respondents
had to rank the importance of the safeguard
subjects and rate them on a graphical scale ranging from
0 (not important for LCA) to 100 (the most important
safeguard subject for LCA). This data was used to evaluate
the rating of the safeguard subjects, as the first cognitive
reference described above.

• Perception of damages in Europe: An introduction was
given, outlining how impacts and damages to safeguard
subjects within LCA can be used for environmental management.
Respondents were asked to state which safeguard
subject they thought was damaged most in Europe
and which was damaged least. Then, for every
safeguard subject, they had to rate the seriousness of the
damages in Europe on a five-point scale (from 'no serious
damages' to 'very serious damages').

• Valuation of damage indicators: The definitions of the
damage indicators from Eco-indicator99 (Goedkoop et
al. 1999) and the concepts behind them were introduced
- DALYs³, PDF⁴, and surplus energy⁵ in particular. A reference
scenario (see Section 2.1) was introduced and the
students were asked to refer to this when assigning weights.

• Questions concerning cultural theory and attitudes towards
the environment: 18 questions were posed to measure
the level of agreement with the various cultural perspectives
(Marris et al. 1996) and attitudes towards the
environment (Thompson et al. 1994, Siegrist 1996)⁶.


3 DALYs (Disability Adjusted Life Years) are a concept for measuring
damages to human health. They are used by the WHO and the World
Bank, among others (Murray et al. 1996).

4 In the Eco-indicator99 methodology, PDF (the potentially disappeared fraction
of species) is used as an indicator for damages to Ecosystem Quality.

5 The concept of surplus energy is used within the Eco-indicator99 to express
damages to resource stocks of minerals and fossil fuels.

6 Data from this section will not be discussed further, as the questions did
not allow classification of the respondents into different groups. This was
not possible as the students' cultural perspectives and attitudes were too
homogeneous, unlike the EI99 survey (Mettier & Hofstetter 2004), where
the questions possessed good discriminative power.


In order to avoid having the previous weightings bias the
respondents, different scales were provided for each of the
three weighting tasks (safeguard subjects, perceived damage
in Europe, and the data model). As a result, the
respondents had to think afresh about every weighting and
could not simply repeat the same answer. To keep the
ratings comparable, every weighting task began with
a ranking.

2.1 The reference scenarios 

After rating the safeguard subjects and the damages in Europe,
which are two references that are not based on data, a
damage scenario for a small region with 100,000 inhabitants
was introduced to the panelists. Two versions were distributed
in order to conduct the test of scope. Version A is based
on the normalization values of EI99⁷. Version B describes a
region that has 5 times higher damages in Human Health,
but half the damages in Ecosystems. This represents a region
where certain densely populated areas are impacted strongly
while wide expanses remain almost natural, as land is used
extensively. Damage to the safeguard subject Resources is
the same for A and B. The following data was presented:

• Human Health: Damage is 800-1600 DALYs/a for Version
A and 4000-6400 DALYs/a for Version B.

• Ecosystems: For Version A, an average of 45-55% of all
species that could occur in a certain area are not found⁸.
For Version B, this amount is 25-30%.

• Resources: For Versions A and B the surplus energy is
600-800 million MJ/a (this is equal to the per capita gross
energy use of 3800-5400 Swiss inhabitants).

3 Results of the Survey 

A total of 109 questionnaires were distributed; 56 of Version
A and 53 of Version B. Of these, 58 questionnaires
were returned. That is a 53% rate of return, which is high
for a survey. Of the returned questionnaires, 34 had Version
A (normalization values of EI99) and 24 had Version B
(modified damage data).

First, we present a comparison between the valuation of the
safeguard subjects and perceived damages in Europe. For
the valuation of the safeguard subjects and the perceived
damages in Europe, the two versions of the questionnaire
were identical. Therefore, the results of all of the respondents
are presented together. We address the question of
whether the valuations for these references differ significantly
(see Section 3.1). For the valuation of the scenarios, the questionnaires
contained different data models. Thus, the answers
from the two versions must be compared in order to
conduct a test of scope (see Section 3.2).

7 The normalization values from the EI99 report have been multiplied by
100,000 for the reasons described in Section 1.2. The highest and the
lowest normalization value from the three cultural perspectives were presented
in order to cover parts of the uncertainty.

8 The damage to Ecosystems is described as the average potentially damaged
fraction of species (PDF) for that region. The normalization values
from the EI99 report (PDF·km²·yr/yr) were divided by the size of the
region to get the PDF. In the questionnaire, a region of 100,000 inhabitants
was chosen, containing 1/3800 of the reference population in EI99.
Therefore, the size of the region chosen is 1/3800 of the reference area
in EI99 (= 1000 km²).
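
The scaling arithmetic in footnotes 7 and 8 can be written out as a short sketch. It reflects our reading of the footnotes (per-capita normalization values multiplied by 100,000 inhabitants, and the ecosystem value then divided by the 1000 km² region); the function names and the per-capita input values are hypothetical and are not EI99 figures.

```python
REGION_INHABITANTS = 100_000   # scenario region, from the text
REGION_AREA_KM2 = 1_000        # region size, from footnote 8
REGION_SHARE_DENOMINATOR = 3_800  # region is 1/3800 of the EI99 reference

# Reference population and area implied by the 1/3800 share.
reference_population = REGION_INHABITANTS * REGION_SHARE_DENOMINATOR
reference_area_km2 = REGION_AREA_KM2 * REGION_SHARE_DENOMINATOR


def regional_damage(per_capita_value):
    """Scale a per-capita normalization value to the 100,000-inhabitant region."""
    return per_capita_value * REGION_INHABITANTS


def regional_pdf(per_capita_pdf_km2):
    """Convert a per-capita PDF*km2 value into an average PDF for the region."""
    return regional_damage(per_capita_pdf_km2) / REGION_AREA_KM2


# Hypothetical per-capita inputs, chosen only to illustrate the arithmetic.
print(reference_population)   # 380000000
print(reference_area_km2)     # 3800000
print(regional_damage(0.01))  # about 1000 (e.g. DALYs/a for the region)
print(regional_pdf(0.005))    # about 0.5, i.e. 50% of species
```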

3.1 Valuation of safeguard subjects and perceived damages

As a first valuation task, respondents ranked and rated the
safeguard subjects according to importance. This task implies
a trade-off between the intrinsic values respondents attribute
to the safeguard subjects. The most important safeguard
subject was assigned a value of 100%. The safeguard
subjects that ranked 2nd and 3rd had to be placed on a graphical
scale of 0% to 100%. The respondents were told that
100% means 'equal importance' compared to the safeguard
subject ranked 1st, and that 0% means the 'safeguard subject
should not be included in LCA'. As is commonly done in LCA
weighting, the figures were transformed into weights that add
up to 100% ($W_{HH} + W_{EQ} + W_{R} = 100\%$). Results from ranking and
rating the safeguard subjects are shown in Table 1 and Fig. 1.
A t-test reveals that the importance of the safeguard subjects
Human Health and Ecosystems is rated significantly higher
(p < .0001) than the importance of Resources. Human Health
and Ecosystems are rated as equally important.
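
The transformation of ratings into weights can be sketched as follows. The text does not spell out the formula, so this sketch assumes simple proportional normalization (each rating divided by the sum of the three ratings); the function name `ratings_to_weights` and the example ratings are hypothetical.

```python
def ratings_to_weights(ratings):
    """Rescale ratings proportionally so that the weights add up to 100%."""
    total = sum(ratings.values())
    if total <= 0:
        raise ValueError("at least one rating must be positive")
    return {subject: 100.0 * value / total for subject, value in ratings.items()}


# Hypothetical respondent: Human Health ranked 1st (fixed at 100%),
# the other two safeguard subjects placed on the 0-100% scale.
ratings = {"Human Health": 100, "Ecosystems": 80, "Resources": 20}
weights = ratings_to_weights(ratings)
print(weights)  # {'Human Health': 50.0, 'Ecosystems': 40.0, 'Resources': 10.0}
```

Under this assumption, the ratio between any two weights equals the ratio between the corresponding ratings, so the respondent's stated relative importance is preserved.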

In Fig. 1a, a box plot depicts the statistics for each safeguard
subject: the mean values (circle), the 95% confidence
interval for the mean value (small T-line), the range that
contains 50% of the values (box), the median (line across
the box) and the extreme values (dotted T-line).

Table 1: Ranking of the safeguard subjects (n=56)

| Safeguard subject | Most important (in %) | 2nd most important (in %) | Least important (in %) |
|-------------------|-----------------------|---------------------------|------------------------|
| Human Health      | 51.8                  | 32.1                      | 16.1                   |
| Ecosystems*       | 44.6                  | 42.9                      | 12.5                   |
| Resources         | 3.6                   | 25.0                      | 71.4                   |

* For the questionnaire, the term Ecosystems was chosen instead of
Ecosystem Quality (EQ). For clarity's sake, we use the terminology
from the questionnaire in this chapter.

Fig. 1b shows a mixing triangle, which contains the assigned
weights for all 56 participants who filled in the questionnaire.
Every cross marks the weights assigned by a participant.
The dotted lines mark the center of the mixing triangle
where every safeguard subject is weighted equally. Drawing
similar lines from every cross to the three axes makes it possible
to read off each participant's assigned weights.
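
How a weighting set is placed in such a triangle can be sketched with barycentric coordinates. This is an illustrative sketch only: the corner layout (Human Health at the lower left, Ecosystems at the lower right, Resources at the top) and the function name `triangle_xy` are assumptions and need not match the orientation of the published figure.

```python
from math import sqrt


def triangle_xy(w_hh, w_eco, w_res):
    """Map three weights to a point in an equilateral triangle of side 1.

    Assumed corners: Human Health (0, 0), Ecosystems (1, 0),
    Resources (0.5, sqrt(3)/2).
    """
    total = w_hh + w_eco + w_res
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    eco_share = w_eco / total
    res_share = w_res / total
    x = eco_share + 0.5 * res_share
    y = sqrt(3) / 2 * res_share
    return x, y


# Equal weights land in the center of the triangle.
center = triangle_xy(1, 1, 1)
print(center)
# A respondent weighting only Human Health lands in that corner.
print(triangle_xy(100, 0, 0))  # (0.0, 0.0)
```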

For the next valuation task, we explained that some LCA
methods result in damage indicators for the assessed products
and that weighting damages to the safeguard subject is
relevant for LCA when comparing products. Then, the respondents
had to rank and rate the damages in Europe that
arise from anthropogenic influences. No further data was
given; the participants had to rely on their own knowledge
and experience (perceived damages). To do the rating, respondents
made a selection by filling in a cross on a five-step
scale ranging from 'damage low' to 'damage high'. For
the statistical analysis, the answer categories were given a
code from 1 (damage low) to 5 (damage high)⁹. These codes
were again converted into weights that add up to 100%.
The results of the ranking are shown in Table 2. The ratings
are presented in Fig. 2.

In the question about ranking, a majority responded that
Ecosystems is the safeguard subject damaged the most in
Europe and Human Health the one damaged the least. Statistical
analysis (t-test) of the rating data revealed significant
distinctions in the rating of the perceived damages. Damages
to Ecosystems are rated higher than those to Resources
(p < .001) or Human Health (p < .0001). Moreover, Resources
are perceived as being more damaged than Human
Health (p < .0001).

9 The results are not sensitive to the chosen range for the codes. Whether
codes are from 1 to 3 or from 1 to 9, the results of the statistical tests are
the same.

Table 2: Ranking of the perceived damages to the safeguard subjects in
Europe

| Safeguard subject | Damaged the most (in %) | Damaged the least (in %) |
|-------------------|-------------------------|--------------------------|
| Human Health      | 5.4                     | 63.6                     |
| Ecosystems        | 71.4                    | 5.5                      |
| Resources         | 23.2                    | 30.9                     |

Fig. 1a&b: Weights assigned to the safeguard subjects Ecosystems, Human Health, and Resources

Fig. 2: Weights associated with the perceived damages in Europe (n=52). For explanations of the box plot or the mixing triangle, see Fig. 1

3.2 Valuation of data models: Is there a difference between the two scenarios in a test of scope?

In order to prepare for the valuation of the data models, we
introduced the damage indicators from EI99. After that, the
respondents had to decide whether or not they accepted these
indicators. This was done in order to avoid the situation of
respondents rating a damage category low because they do
not agree that the indicator represents a damage. The acceptance
was measured on a scale from 1 (poor indicator)
to 5 (very good indicator). The acceptance of all three indicators
was good, with an average of 3.3 for Human Health
and 3.7 for Resources and Ecosystems. This meant that lack
of acceptance of the indicators should not bias the valuation
of the damages presented in the scenarios.

After introducing the indicators, a scenario was presented,
as described in Section 2.1. As mentioned above, to allow for
a test of scope, the two versions of the
survey given to the respondents contained substantially
different data on the damage to the region.
As a first valuation task, respondents had to choose
between different reduction targets. The reduction targets
were formulated such that a trade-off between the different
damage categories had to be made. Such choice questions
and assessment of tradeoff rates offer an interesting method
for eliciting values on damage indicators. However, the results
from these choice questions cannot be easily compared
to the weights directly allocated. Therefore, we will present
the method and the results in Part B of this paper (in an
upcoming issue of Int J LCA).

Following that, respondents had to allocate 50 reduction
points among the three damages in order to maximize the
resulting total reduction in environmental damage. The figures
given by the respondents were multiplied by 2 in order
to get weights that add up to 100%. The data was analyzed
using a t-test to compare results from the two versions. Based
on the data presented, we expected that the respondents who
filled in Version B would assign higher weights to Human
Health and lower weights to Ecosystems, but no such difference
could be found. The p-values are far from being significant
(p > .4 for all three damages). This finding can easily
be understood when looking at Table 3 where the weights
are shown. The distribution of the weights is almost the same
for both groups. This finding indicates that the data given
had no influence on the weighting of the damages. The test
of scope failed. We discuss the implications of this important
finding further in the Conclusions section.


Table 3: Average weights assigned to the two versions of the
questionnaire (in %)

                Weights     Std.    Weights     Std.    t-test
                Version A   Dev.    Version B   Dev.    (p-value)
Human Health    29.3        15.2    30.3        15.6    .42
Ecosystems      41.7        12.6    42.6        12.6    .72
Resources       29.0        13.2    27.1        13.5    .57
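
The scaling of the point allocations and the comparison of the two
versions can be sketched as follows. This is a minimal illustration,
not the analysis script used in the study: the point allocations are
hypothetical, the function names are invented for this example, and
Welch's unequal-variance form of the t statistic is an assumption,
since the text does not state which variant of the t-test was used.

```python
from math import sqrt
from statistics import mean, variance

CATEGORIES = ("Human Health", "Ecosystems", "Resources")

def points_to_weights(points, total_points=50):
    """Scale one respondent's point allocation to percentage weights."""
    if sum(points) != total_points:
        raise ValueError("allocation must use exactly %d points" % total_points)
    factor = 100 / total_points  # 50 points -> multiply by 2
    return [p * factor for p in points]

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Hypothetical allocations (Human Health, Ecosystems, Resources),
# NOT the survey data.
version_a = [(15, 20, 15), (10, 25, 15), (20, 20, 10), (15, 22, 13)]
version_b = [(14, 22, 14), (16, 20, 14), (12, 24, 14), (18, 19, 13)]

weights_a = [points_to_weights(p) for p in version_a]
weights_b = [points_to_weights(p) for p in version_b]

for i, name in enumerate(CATEGORIES):
    t, df = welch_t([w[i] for w in weights_a], [w[i] for w in weights_b])
    print(f"{name}: t = {t:.2f}, df = {df:.1f}")
```

Turning t and the degrees of freedom into a p-value requires the
t distribution, which the Python standard library does not provide;
a statistics package such as SciPy (scipy.stats.ttest_ind with
equal_var=False) could be used for that step.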


3.3 Are there differences in the ratings of the three references?

In order to compare the allocation of weights for the three 
references, we analyzed the data using t-tests. Because the 
weights attributed to the damage scenarios by the two groups 
did not differ, the data for this reference is analyzed for all 
of the respondents together. The average weights and the 
results of the tests are shown in Fig. 3. 

Fig. 3: Ratings of the environmental categories for three different
cognitive references (diagram) and statistics from a t-test (Table).
The p-value denotes the probability that the mean rating of an
environmental category is the same for the references compared.
Significant differences in the rating (p < .05) are marked with *;
highly significant (p < .001) with **

The analysis revealed that the safeguard subjects (intrinsic
values) are rated significantly differently than the perceived
damages in Europe or the damage scenarios. When rating
the importance of the safeguard subjects, Human Health
was the most important; but it is the least important when
rating the damages in Europe or the data given in the scenario.
In contrast, the importance of the safeguard subject
Resources is rated the least important, but Resources is rated
higher in terms of damages. This shift is also recognizable
when testing the differences between the ratings of the three
references statistically. Significant differences are found
between the rating of safeguard subjects and the rating of
perceived damages; for Human Health, the differences are even
greater (see the first column in Table 1). The same holds
true for a comparative rating of safeguard subjects and the
data model (see the second column in Table 1). As one can
see, this shift in the ranking does not arise from a bias of the
different scales used to elicit the weights, because the ranking
task was the same for each reference. We can, therefore,
conclude that panelists distinguish between the importance
of a safeguard subject (intrinsic values) and the importance
of perceived damages in Europe. This is important since, to
weight the damage indicators, we need to elicit the panelists'
assessment of the actual damages, rather than their assessment
of the general importance of the safeguard subjects.
The step-by-step presentation of the information did not allow
the order of the questions to be random (see footnote 10).
Therefore, the order may be a source of bias (sequence effect),
as the previous weighting of perceived damages in Europe may
influence the final weighting of the data model. The sequence
effect is not relevant for most panel surveys where only one
reference is rated.

When comparing the perceived damages in Europe to the
data model, only Resources is rated significantly differently,
whereas Human Health and Ecosystems are rated more or
less the same. This is surprising, as the method of eliciting the
weights - especially the scale used - was quite different. We
conclude that the importance of the safeguard subjects is
perceived to be different from the importance of the perceived
damages or of the data model, while the ratings of the
perceived damages in Europe and the data presented are similar.
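
Because every participant weighted all three references, comparisons
between references are within-subject comparisons. One way to test
them is a paired t-test. The text only states that t-tests were
used, so the paired form, the function name, and the weights below
are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return the paired t statistic and degrees of freedom for two
    equally long lists of weights given by the same respondents."""
    if len(first) != len(second):
        raise ValueError("both lists must cover the same respondents")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical Human Health weights (in %) from five respondents,
# NOT the survey data.
safeguard_subject = [45, 40, 50, 35, 40]
perceived_damage = [30, 35, 30, 25, 35]

t, df = paired_t(safeguard_subject, perceived_damage)
print(f"t = {t:.2f}, df = {df}")
```

The paired form uses each respondent as their own control, so that
stable individual differences in overall rating behavior do not
inflate the error term.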


10 In a randomized questionnaire, all of the information must be provided
before the valuation tasks are performed, because the two references
referring to damages cannot be valued before the safeguard subjects
are introduced. To avoid starting the questionnaire with a huge
introduction, information was given step by step.


4 Conclusions and Recommendations 

The study was based on damage categories. Therefore, the
conclusions we draw may not apply to surveys based on
indicators located earlier in the cause-effect chain. The most
important finding of this study is that the respondents were
not sensitive to numerical data that was presented as reference
information for weighting damage categories. The
analysis indicates that they were not able to take
into account the magnitude of the data presented; the test
of scope failed. This raises three questions.
First, if the respondents are not influenced by the data that
is presented in the questionnaire, what is the reference or
mental model upon which respondents base their beliefs and
values? Second, can we expect that 'more sophisticated'
subjects (see footnote 11) would respond differently? Third,
which prerequisites should an empirical weighting procedure
for student samples or sophisticated decision makers fulfill
in order to incorporate numerical data?

With respect to the first question, it seems relevant that the
weights from the valuation of the data model were similar to
those that resulted from the valuation of personal beliefs about
damages in Europe, although the method of measurement and
context of the questions were very different. The general values
and beliefs about the damage categories which a respondent
has prior to the survey seem to play a more important role
than the data presented in the survey. Moreover, the qualitative
aspects of the category that is valued seem to be more
decisive than the figures specified in the questionnaire. This is
supported by the finding that respondents rate the importance
of safeguard subjects significantly differently than they do the
damages to these safeguard subjects. Apparently, endeavors
at explaining indicators for the damage categories and
introducing numerical data about damages are not successful,
especially if the sample does not consist of experts. Similar
outcomes could be found in our former survey (Mettier &
Hofstetter 2004), where two thirds of the respondents stated
that they were not influenced by any figures presented.


11 Scholz (1987, p. 134) distinguishes between "experienced and
sophisticated decision makers. By experienced decision makers we mean
individuals who either have performed a certain decision repeatedly or
have experienced the information, e.g. have observed the concrete
history of the environmental damage indicators. By sophisticated decision
makers, we mean individuals who possess the ability or knowledge to
cope with a situation or find an appropriate solution for a problem."

This result is in line with findings from risk communication
(Scholz et al. 1993) and diagnostic decision making
(Gigerenzer 1999), which show that subjects and decision makers are
very insensitive to numerical data when this information cannot
be linked to an individual's perceivable, experiential
knowledge. Just changing abstract numbers is, thus, not
sufficient for conveying changes to environmental reality.

The second question concerns the participants' knowledge
and skills. One could question to what extent the results
depend on the sample. Is it likely that experienced
environmental experts would incorporate the numerical data in the
questionnaire in a 'better' (i.e. more sensitive) way? This
question is difficult to address and cannot be answered without
empirical research. Nevertheless, it is likely that there are
only a few people who have extensive experience or the degree
of sophistication to prioritize and weight environmental
problems and categories appropriately. Cognitively representing,
integrating, and differentiating among statistical data relevant
for LCA damage weighting seems to be - even for educated
scientists - a difficult task, and few possess sufficient
aptitude for this. Our data suggest that people who are not
used to such valuation tasks tend to report a general belief
about the categories rather than assign weights that refer to
the reference specified. Therefore, it is important to provide
more careful, adequate preparation for the valuation task, so
that the statement about the reference is more valid; otherwise,
one has to accept that the reported figures are general
statements. Possible methods for introducing the valuation task
more carefully include using focus groups or multi-round surveys
in which respondents get feedback and have the opportunity
to change their opinion (Delphi methods).

The latter issue is the core of the third question. How should
questions about weighting be posed in further studies? Our
study supports some conclusions for future weighting surveys
about three aspects of framing: context, emotional and
cognitive associations, and the reference point. Our point
of view on the reference point is that the main goal of LCA
studies - to reduce environmental consequences associated
with product life cycles - already defines a reference point.
The reference valued in weighting surveys should therefore
be expressed as a defined reduction in an environmental
category. Although we could not find major differences in
weighting between (the total) perceived damages in Europe
and the reduction of the damages in the local scenarios, we
think that from a methodological point of view it is important
to specify the reference as a relative change to the category,
not the total of the category.

For weighting surveys in LCIA, the values are elicited in
order to interpret the importance of category indicator results.
This application already introduces major aspects of
the context. Regarding the consistency of the LCIA models,
the context should therefore be characterized by data describing
the consequences of an environmental impact or damage
category in relative terms. Our results showed that (for
non-experts) the qualitative aspects of this data
might have more influence than the quantitative.

Because qualitative information is better understood, it makes
sense to focus increasingly on information about the
model structure, i.e. the environmental problems integrated
into a damage category. For example, every environmental
problem that contributes to a damage category could be presented
to the participants, thereby potentially reducing so-called
prominence or availability effects (Nisbett et al. 1980, van der
Pligt et al. 1998). Availability effects are often reported in CV
studies where a category gets higher values when it is valued
on its own than when it is part of (or embedded into) another
category. It is therefore probable that environmental problems
integrated into a large category containing numerous problems
are systematically underestimated. In the EI99, the damage
category Human Health contains six environmental problems
(respiratory effects, climate change, radioactive emissions, ...)
while Resources only contains two (energetic and non-energetic
resources). This suggests that the weighting of an
environmental problem associated with Resources (e.g. energetic)
may end up being too high as compared to the weighting
of an environmental problem associated with Human
Health (e.g. climate change). Therefore, the environmental
problems integrated into a damage category may provide decisive
information for enhancing the quality of a weighting
procedure for damage categories. This conclusion is analogous
to the 'out of sight, out of mind' bias in decision research
(Fischhoff 1982, Kleindorfer et al. 1993), which implies
that a category is considered to be less likely or less
important, the fewer subcategories are explicitly listed.
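
The arithmetic behind this concern can be made explicit. The sketch
below takes the Version A weights from Table 3 and the problem counts
stated above, and assumes, purely for illustration, that a category
weight is shared equally among the problems the category contains.
The actual contribution of each problem depends on the damage model,
not on an equal split, and the function name is invented for this
example.

```python
# Version A category weights in % (Table 3) and the number of
# environmental problems per category as stated in the text for EI99.
category_weight = {"Human Health": 29.3, "Resources": 29.0}
problem_count = {"Human Health": 6, "Resources": 2}

def implied_weight_per_problem(weight, n_problems):
    """Weight per problem under the ASSUMED equal split within a category."""
    return weight / n_problems

per_problem = {
    name: implied_weight_per_problem(category_weight[name], problem_count[name])
    for name in category_weight
}

for name, value in per_problem.items():
    print(f"{name}: {value:.1f}% per problem")
```

Under this simplification, a problem in Resources carries roughly
three times the implied weight of a problem in Human Health, although
both categories receive almost the same total weight.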

A gap may exist between two requirements the polled panel
should meet: internal validity and external validity. The panel
should be externally valid; that is, representative of a broad
stakeholder group of LCA (Mettier & Hofstetter 2004).
At the same time, the results should be internally valid, meaning
that we really measure what we intend to measure. As
only some experts can manage the data presented, there may
be a conflict between internal and external validity. We therefore
see two ways of proceeding with further weighting surveys.
One is to focus on internal validity and conduct comprehensive
multi-round expert procedures. These procedures
are based on a lot of environmental information, like the
data model in this study. Through repeated measurements, the
participants become experienced and, thus, experts as well. The
experience and time necessary for comprehending this data
restrict the selection of participants. Nevertheless, we do
not expect that the limited number of experts in the field of
LCIA and weighting restricts similar studies among experts.
Another interesting approach would be to focus on external
validity and poll a broader population's general beliefs about
the environmental categories. For this procedure, one may
choose a reference similar to the perceived damages in Europe;
that means not relying on data. The outcome would not
match the specific category indicators used in LCIA as well,
but would be more representative.

As mentioned, we also examined the influence of different 
question formats on the weighting results. These findings 
will be presented in Part B of this paper. 